Introduction
This notebook provides examples
New Package
We need a new package, nngeo, that can be installed with the following script:
# install.packages("nngeo")
Dependencies
This notebook requires the following packages
# tidyverse packages
library(dplyr) # data wrangling
# spatial packages
library(mapview) # preview spatial data
library(nngeo) # eliminiate holes
library(sf) # spatial tools
# other packages
library(here) # file path management
Load Data
This notebook requires three files:
# precinct data
precinct <- st_read(here("data", "example-data", "POL_WRD_2010_Prec", "POL_WRD_2010_Prec.shp"),
stringsAsFactors = FALSE) %>%
st_transform(crs = 26915)
Reading layer `POL_WRD_2010_Prec' from data source `/Users/prenercg/GitHub/slu-soc5650/lecture-C/data/example-data/POL_WRD_2010_Prec/POL_WRD_2010_Prec.shp' using driver `ESRI Shapefile'
Simple feature collection with 233 features and 5 fields
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: 871512.3 ymin: 982403 xmax: 915268.6 ymax: 1070966
proj4string: +proj=tmerc +lat_0=35.83333333333334 +lon_0=-90.5 +k=0.9999333333333333 +x_0=249999.9999999999 +y_0=0 +datum=NAD83 +units=us-ft +no_defs
# COVID zip code data
city <- st_read(here("data", "example-data", "daily_snapshot_stl_city.geojson"),
crs = 4326, stringsAsFactors = FALSE) %>%
st_transform(crs = 26915)
Reading layer `daily_snapshot_stl_city' from data source `/Users/prenercg/GitHub/slu-soc5650/lecture-C/data/example-data/daily_snapshot_stl_city.geojson' using driver `GeoJSON'
Simple feature collection with 28 features and 11 fields
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: -90.32052 ymin: 38.53185 xmax: -90.16657 ymax: 38.77443
CRS: EPSG:4326
county <- st_read(here("data", "example-data", "daily_snapshot_stl_county.geojson"),
crs = 4326, stringsAsFactors = FALSE) %>%
st_transform(crs = 26915)
Reading layer `daily_snapshot_stl_county' from data source `/Users/prenercg/GitHub/slu-soc5650/lecture-C/data/example-data/daily_snapshot_stl_county.geojson' using driver `GeoJSON'
Simple feature collection with 48 features and 11 fields
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: -90.73653 ymin: 38.3883 xmax: -90.11771 ymax: 38.89118
CRS: EPSG:4326
Dissolving Features
In our preinct data, we have a variable named WARD10. This is the City Ward that each preinct falls within. If we wanted to map wards instead of precincts, we can modify our geometric data using group_by() and summarise():
precinct %>%
select(WARD10) %>%
rename(ward = WARD10) %>%
group_by(ward) %>%
summarise() -> ward
Once these have been dissolved, we can explore them with mapview():
mapview(ward)
Notice how some wards, such as Ward 4 and Ward 21 in North City, Ward 6 and Ward 7 in Downtown, and Wards 12, 15, and 23 in South City have “holes.” These are common artifacts of the dissolve process that result from precincts’ geometries not perfectly abutting each other.
The nngeo package has a great function st_remove_holes() that can be used to get rid of these:
ward <- st_remove_holes(ward)
We can check out the differences with mapview():
mapview(ward)
Be careful with removing holes, particularly if your features have enclaves in them (as Kansas City does) - those enclaves will get removed as well, and st_difference() will have to be used to cut the enclaves back out!
Merging Features
Sometimes, we get data that we want to use in separate files, such as the city and county COVID data (current as of 2020-04-19). We can use rbind() to combine them.
region <- rbind(city, county)
Be sure to check them first to make sure they have the same names/types of columns to prevent issues with your bound data!
Once these have been merged, we can explore them with mapview():
mapview(region, zcol = "zip")
Notice that zip codes that lie along the city-county boundary are split. We can use the dissolve features workflow to combine them!
region %>%
select(zip, confirmed) %>%
group_by(zip) %>%
summarise(confirmed = sum(confirmed, na.rm = TRUE)) -> region
Once these have been dissolved, we can explore them with mapview():
mapview(region, zcol = "zip")
Dealing with Geometry Collections
Sometimes when we geoprocess data, we get a mix of geometry types. This occurs with the region data, where we get a mix of "POLYGON" and "MULTIPOLYGON" features:
st_is(region, "POLYGON")
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[23] TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE TRUE TRUE FALSE FALSE TRUE FALSE TRUE FALSE
[45] FALSE FALSE FALSE TRUE FALSE FALSE TRUE FALSE FALSE TRUE TRUE FALSE FALSE FALSE FALSE TRUE FALSE FALSE TRUE FALSE
This is known as a “geometry collection.” We can check the attributes of the geometry column in region to confirm this:
attributes(region$geometry)$class
[1] "sfc_GEOMETRY" "sfc"
We see "sfc_GEOMETRY" but expect either "sfc_POLYGON" or "sfc_MULTIPOLYGON" instead. We can convert to polygon using:
region <- st_collection_extract(region, "POLYGON")
Once these have been merged, we can explore them with mapview():
mapview(region, zcol = "zip")
LS0tCnRpdGxlOiAiTGVjdHVyZS1DIEV4YW1wbGUgTm90ZWJvb2sgLSBDb21wbGV0ZSIKYXV0aG9yOiAiQ2hyaXN0b3BoZXIgUHJlbmVyLCBQaC5ELiIKZGF0ZTogJyhgciBmb3JtYXQoU3lzLnRpbWUoKSwgIiVCICVkLCAlWSIpYCknCm91dHB1dDogCiAgZ2l0aHViX2RvY3VtZW50OiBkZWZhdWx0CiAgaHRtbF9ub3RlYm9vazogZGVmYXVsdCAKLS0tCgojIyBJbnRyb2R1Y3Rpb24KVGhpcyBub3RlYm9vayBwcm92aWRlcyBleGFtcGxlcyAKCiMjIE5ldyBQYWNrYWdlCldlIG5lZWQgYSBuZXcgcGFja2FnZSwgYG5uZ2VvYCwgdGhhdCBjYW4gYmUgaW5zdGFsbGVkIHdpdGggdGhlIGZvbGxvd2luZyBzY3JpcHQ6CgpgYGB7ciBpbnN0YWxsLW5uZ2VvLCBldmFsID0gRkFMU0V9CiMgaW5zdGFsbC5wYWNrYWdlcygibm5nZW8iKQpgYGAKCiMjIERlcGVuZGVuY2llcwpUaGlzIG5vdGVib29rIHJlcXVpcmVzIHRoZSBmb2xsb3dpbmcgcGFja2FnZXMKCmBgYHtyIGxvYWQtcGFja2FnZXN9CiMgdGlkeXZlcnNlIHBhY2thZ2VzCmxpYnJhcnkoZHBseXIpICAgICMgZGF0YSB3cmFuZ2xpbmcKCiMgc3BhdGlhbCBwYWNrYWdlcwpsaWJyYXJ5KG1hcHZpZXcpICAjIHByZXZpZXcgc3BhdGlhbCBkYXRhCmxpYnJhcnkobm5nZW8pICAgICMgZWxpbWluaWF0ZSBob2xlcwpsaWJyYXJ5KHNmKSAgICAgICAjIHNwYXRpYWwgdG9vbHMKCiMgb3RoZXIgcGFja2FnZXMKbGlicmFyeShoZXJlKSAgICAgIyBmaWxlIHBhdGggbWFuYWdlbWVudApgYGAKCiMjIExvYWQgRGF0YQpUaGlzIG5vdGVib29rIHJlcXVpcmVzIHRocmVlIGZpbGVzOgoKYGBge3IgbG9hZC1kYXRhfQojIHByZWNpbmN0IGRhdGEKcHJlY2luY3QgPC0gc3RfcmVhZChoZXJlKCJkYXRhIiwgImV4YW1wbGUtZGF0YSIsICJQT0xfV1JEXzIwMTBfUHJlYyIsICJQT0xfV1JEXzIwMTBfUHJlYy5zaHAiKSwgCiAgICAgICAgICAgICAgICAgICAgc3RyaW5nc0FzRmFjdG9ycyA9IEZBTFNFKSAlPiUKICBzdF90cmFuc2Zvcm0oY3JzID0gMjY5MTUpCgojIENPVklEIHppcCBjb2RlIGRhdGEKY2l0eSA8LSBzdF9yZWFkKGhlcmUoImRhdGEiLCAiZXhhbXBsZS1kYXRhIiwgImRhaWx5X3NuYXBzaG90X3N0bF9jaXR5Lmdlb2pzb24iKSwgCiAgICAgICAgICAgICAgICBjcnMgPSA0MzI2LCBzdHJpbmdzQXNGYWN0b3JzID0gRkFMU0UpICU+JQogIHN0X3RyYW5zZm9ybShjcnMgPSAyNjkxNSkKCmNvdW50eSA8LSBzdF9yZWFkKGhlcmUoImRhdGEiLCAiZXhhbXBsZS1kYXRhIiwgImRhaWx5X3NuYXBzaG90X3N0bF9jb3VudHkuZ2VvanNvbiIpLCAKICAgICAgICAgICAgICAgICAgY3JzID0gNDMyNiwgc3RyaW5nc0FzRmFjdG9ycyA9IEZBTFNFKSAlPiUKICBzdF90cmFuc2Zvcm0oY3JzID0gMjY5MTUpCmBgYAoKIyMgRGlzc29sdmluZyBGZWF0dXJlcwpJbiBvdXIgYHByZWluY3RgIGRhdGEsIHdlIGhhdmUgYSB2YXJpYWJsZSBuYW1lZCBgV0FSRDEwYC4gVGhpcyBpcyB0aGUgQ2l0eSBXYXJkIHRoYXQgZWFjaCBwcmVpbmN0IGZhbGxzIHdpdGhpbi4gSWYgd2Ugd2FudGVkIHRvIG1hcCB3YXJkcyBpbnN0ZWFkIG9mIHByZWNpbmN0cywgd2UgY2FuIG1vZGlmeSBvdXIgZ2VvbWV0cmljIGRhdGEgdXNpbmcgYGdyb3VwX2J5KClgIGFuZCBgc3VtbWFyaXNlKClgOgoKYGBge3IgZGlzc29sdmUtd2FyZH0KcHJlY2luY3QgJT4lCiAgc2VsZWN0KFdBUkQxMCkgJT4lCiAgcmVuYW1lKHdhcmQgPSBXQVJEMTApICU+JQogIGdyb3VwX2J5KHdhcmQpICU+JQogIHN1bW1hcmlzZSgpIC0+IHdhcmQKYGBgCgpPbmNlIHRoZXNlIGhhdmUgYmVlbiBkaXNzb2x2ZWQsIHdlIGNhbiBleHBsb3JlIHRoZW0gd2l0aCBgbWFwdmlldygpYDoKCmBgYHtyIGV4cGxvcmUtd2FyZH0KbWFwdmlldyh3YXJkKQpgYGAKCk5vdGljZSBob3cgc29tZSB3YXJkcywgc3VjaCBhcyBXYXJkIDQgYW5kIFdhcmQgMjEgaW4gTm9ydGggQ2l0eSwgV2FyZCA2IGFuZCBXYXJkIDcgaW4gRG93bnRvd24sIGFuZCBXYXJkcyAxMiwgMTUsIGFuZCAyMyBpbiBTb3V0aCBDaXR5IGhhdmUgImhvbGVzLiIgVGhlc2UgYXJlIGNvbW1vbiBhcnRpZmFjdHMgb2YgdGhlIGRpc3NvbHZlIHByb2Nlc3MgdGhhdCByZXN1bHQgZnJvbSBwcmVjaW5jdHMnIGdlb21ldHJpZXMgbm90ICpwZXJmZWN0bHkqIGFidXR0aW5nIGVhY2ggb3RoZXIuCgpUaGUgYG5uZ2VvYCBwYWNrYWdlIGhhcyBhIGdyZWF0IGZ1bmN0aW9uIGBzdF9yZW1vdmVfaG9sZXMoKWAgdGhhdCBjYW4gYmUgdXNlZCB0byBnZXQgcmlkIG9mIHRoZXNlOgoKYGBge3IgcmVtb3ZlLXdhcmQtaG9sZXN9CndhcmQgPC0gc3RfcmVtb3ZlX2hvbGVzKHdhcmQpCmBgYAoKV2UgY2FuIGNoZWNrIG91dCB0aGUgZGlmZmVyZW5jZXMgd2l0aCBgbWFwdmlldygpYDoKCmBgYHtyIGNoZWNrLXdhcmR9Cm1hcHZpZXcod2FyZCkKYGBgCgpCZSBjYXJlZnVsIHdpdGggcmVtb3ZpbmcgaG9sZXMsIHBhcnRpY3VsYXJseSBpZiB5b3VyIGZlYXR1cmVzIGhhdmUgZW5jbGF2ZXMgaW4gdGhlbSAoYXMgS2Fuc2FzIENpdHkgZG9lcykgLSB0aG9zZSBlbmNsYXZlcyB3aWxsIGdldCByZW1vdmVkIGFzIHdlbGwsIGFuZCBgc3RfZGlmZmVyZW5jZSgpYCB3aWxsIGhhdmUgdG8gYmUgdXNlZCB0byBjdXQgdGhlIGVuY2xhdmVzIGJhY2sgb3V0IQoKIyMgTWVyZ2luZyBGZWF0dXJlcwpTb21ldGltZXMsIHdlIGdldCBkYXRhIHRoYXQgd2Ugd2FudCB0byB1c2UgaW4gc2VwYXJhdGUgZmlsZXMsIHN1Y2ggYXMgdGhlIGBjaXR5YCBhbmQgYGNvdW50eWAgQ09WSUQgZGF0YSAoY3VycmVudCBhcyBvZiAyMDIwLTA0LTE5KS4gV2UgY2FuIHVzZSBgcmJpbmQoKWAgdG8gY29tYmluZSB0aGVtLiAKCmBgYHtyIG1lcmdlfQpyZWdpb24gPC0gcmJpbmQoY2l0eSwgY291bnR5KQpgYGAKCkJlIHN1cmUgdG8gY2hlY2sgdGhlbSBmaXJzdCB0byBtYWtlIHN1cmUgdGhleSBoYXZlIHRoZSBzYW1lIG5hbWVzL3R5cGVzIG9mIGNvbHVtbnMgdG8gcHJldmVudCBpc3N1ZXMgd2l0aCB5b3VyIGJvdW5kIGRhdGEhCgpPbmNlIHRoZXNlIGhhdmUgYmVlbiBtZXJnZWQsIHdlIGNhbiBleHBsb3JlIHRoZW0gd2l0aCBgbWFwdmlldygpYDoKCmBgYHtyIGV4cGxvcmUtcmVnaW9ufQptYXB2aWV3KHJlZ2lvbiwgemNvbCA9ICJ6aXAiKQpgYGAKCk5vdGljZSB0aGF0IHppcCBjb2RlcyB0aGF0IGxpZSBhbG9uZyB0aGUgY2l0eS1jb3VudHkgYm91bmRhcnkgYXJlIHNwbGl0LiBXZSBjYW4gdXNlIHRoZSBkaXNzb2x2ZSBmZWF0dXJlcyB3b3JrZmxvdyB0byBjb21iaW5lIHRoZW0hCgpgYGB7ciBkaXNzb2x2ZS16aXBzfQpyZWdpb24gJT4lCiAgc2VsZWN0KHppcCwgY29uZmlybWVkKSAlPiUKICBncm91cF9ieSh6aXApICU+JQogIHN1bW1hcmlzZShjb25maXJtZWQgPSBzdW0oY29uZmlybWVkLCBuYS5ybSA9IFRSVUUpKSAtPiByZWdpb24KYGBgCgpPbmNlIHRoZXNlIGhhdmUgYmVlbiBkaXNzb2x2ZWQsIHdlIGNhbiBleHBsb3JlIHRoZW0gd2l0aCBgbWFwdmlldygpYDoKCmBgYHtyIGV4cGxvcmUtZGlzc29sdmVkLXJlZ2lvbn0KbWFwdmlldyhyZWdpb24sIHpjb2wgPSAiemlwIikKYGBgCgojIyBEZWFsaW5nIHdpdGggR2VvbWV0cnkgQ29sbGVjdGlvbnMKU29tZXRpbWVzIHdoZW4gd2UgZ2VvcHJvY2VzcyBkYXRhLCB3ZSBnZXQgYSBtaXggb2YgZ2VvbWV0cnkgdHlwZXMuIFRoaXMgb2NjdXJzIHdpdGggdGhlIGByZWdpb25gIGRhdGEsIHdoZXJlIHdlIGdldCBhIG1peCBvZiBgIlBPTFlHT04iYCBhbmQgYCJNVUxUSVBPTFlHT04iYCBmZWF0dXJlczoKCmBgYHtyIGdlb21ldHJ5LXR5cGVzfQpzdF9pcyhyZWdpb24sICJQT0xZR09OIikKYGBgCgpUaGlzIGlzIGtub3duIGFzIGEgImdlb21ldHJ5IGNvbGxlY3Rpb24uIiBXZSBjYW4gY2hlY2sgdGhlIGF0dHJpYnV0ZXMgb2YgdGhlIGBnZW9tZXRyeWAgY29sdW1uIGluIGByZWdpb25gIHRvIGNvbmZpcm0gdGhpczoKCmBgYHtyIGdlb21ldHJ5LWF0dHJpYnV0ZXN9CmF0dHJpYnV0ZXMocmVnaW9uJGdlb21ldHJ5KSRjbGFzcwpgYGAKCldlIHNlZSBgInNmY19HRU9NRVRSWSJgIGJ1dCBleHBlY3QgZWl0aGVyIGAic2ZjX1BPTFlHT04iYCBvciBgInNmY19NVUxUSVBPTFlHT04iYCBpbnN0ZWFkLiBXZSBjYW4gY29udmVydCB0byBwb2x5Z29uIHVzaW5nOgoKYGBge3IgY29sbGVjdGlvbi1leHRyYWN0fQpyZWdpb24gPC0gc3RfY29sbGVjdGlvbl9leHRyYWN0KHJlZ2lvbiwgIlBPTFlHT04iKQpgYGAKCk9uY2UgdGhlc2UgaGF2ZSBiZWVuIG1lcmdlZCwgd2UgY2FuIGV4cGxvcmUgdGhlbSB3aXRoIGBtYXB2aWV3KClgOgoKYGBge3IgZXhwbG9yZS1maW5hbH0KbWFwdmlldyhyZWdpb24sIHpjb2wgPSAiemlwIikKYGBgCgoKCgpgYGB7ciBtb3ZlLXRvLWRvY3MsIGluY2x1ZGU9RkFMU0V9CiMgeW91IGRvIG5lZWQgdG8gaW5jbHVkZSB0aGlzIGluIGFueSBub3RlYm9vayB5b3UgY3JlYXRlIGZvciB0aGlzIGNsYXNzCmZzOjpmaWxlX2NvcHkoaGVyZTo6aGVyZSgiZXhhbXBsZXMiLCAibGVjdHVyZS1DLWNvbXBsZXRlLm5iLmh0bWwiKSwgCiAgICAgICAgICAgICAgaGVyZTo6aGVyZSgiZG9jcyIsICJpbmRleC5uYi5odG1sIiksIAogICAgICAgICAgICAgIG92ZXJ3cml0ZSA9IFRSVUUpCmBgYAoK